Issues in Building Practical Provenance Systems
Abstract
The importance of maintaining provenance has been widely recognized, particularly with respect to highly manipulated data. However, there are few deployed databases that provide provenance information with their data. We have constructed a database of protein interactions (MiMI), which is heavily used by biomedical scientists, by manipulating and integrating data from several popular biological sources. The provenance stored provides key information for assisting researchers in understanding and trusting the data. In this paper, we describe several desiderata for a practical provenance system, based on our experience with this system. We discuss the challenges that these requirements present, and outline solutions to several of these challenges that we have implemented. Our list of a dozen or so desiderata includes: efficiently capturing provenance from external applications; managing provenance size; and presenting provenance in a usable way. For example, data is often manipulated via provenance-unaware processes, but the associated provenance must still be tracked and stored. Additionally, provenance information can grow to outrageous proportions if it is very rich, fine-grained, or both. Finally, when users view provenance data, they can usually understand a SELECT manipulation, but “why did the bcgCoalesce [1] manipulation output that?”
Similar Resources
Provenance Issues in Platform-as-a-Service Model of Cloud Computing
In this paper we present provenance issues that arise in building Platform-as-a-Service (PaaS) model of cloud computing. The issues are related to designing, building, and deploying of the platform itself, and those related to building and deploying applications on the platform. These include, tracking of commands for successful software installations, tracking of inter-service dependencies, tr...
Contour Crafting Process Plan Optimization Part I: Single-Nozzle Case
Contour Crafting is an emerging technology that uses robotics to construct free form building structures by repeatedly laying down layers of material such as concrete. The Contour Crafting technology scales up automated additive fabrication from building small industrial parts to constructing buildings. Tool path planning and optimization for Contour Crafting benefit the technology by increasin...
Copyright and Provenance: Some Practical Problems
Copyright clearance is an increasingly complex and expensive impediment to the digitization and reuse of information. Clearing copyright issues in a reliable and cost-effective manner for works created in the last 100 years can involve establishing complex provenance chains for the works, their copyrights, and their licenses. This paper gives an overview of some of the practical provenance-rela...
On building Information Warehouses
One of the most important goals of information management (IM) is supporting the knowledge workers in performing their works. In this paper we examine issues of relevance, linkage and provenance of information, as accessed and used by the knowledge workers. These are usually not adequately addressed in most of the IT based solutions for IM. Here we propose a non-conventional approach for buildi...
Advances and Challenges for Scalable Provenance in Stream Processing Systems
While data provenance is a relatively well-studied topic in both the fields of databases and workflow systems, its support within stream processing systems presents a new set of challenges. Given the potentially high event rate of the input streams and the low processing latency requirements imposed by many streaming applications, capturing data provenance effectively in a stream processing sys...
Journal:
- IEEE Data Eng. Bull.
Volume 30, Issue
Pages -
Publication date: 2007